aggregate regret
Country:
- North America > United States > Indiana (0.04)
- North America > United States > Illinois (0.04)
- North America > Canada (0.04)
Technology:
Thresholding Bandit with Optimal Aggregate Regret
Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou
We consider the thresholding bandit problem, whose goal is to identify the arms whose mean rewards are above a given threshold $\theta$, using a fixed budget of $T$ trials. We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret (that is, the expected number of misclassified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating that the algorithm outperforms existing algorithms across a variety of scenarios.
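To make the objective in the abstract concrete, the following minimal sketch estimates the aggregate regret of a plain round-robin sampling baseline. It is not the paper's LSA algorithm. The Bernoulli reward model, the tie rule (an arm whose mean equals the threshold counts as "above"), and the function names misclassified_count and aggregate_regret are all assumptions made for illustration.

```python
import random


def misclassified_count(means, theta, budget, seed=0):
    """Run one round-robin trial and count misclassified arms.

    Assumptions (not from the paper): rewards are Bernoulli with the
    given means, and an arm is labelled "above" when its empirical
    mean is >= theta. This is a uniform baseline, NOT LSA.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    sums = [0.0] * k

    # Spend the budget of T trials by cycling through the arms.
    for t in range(budget):
        i = t % k
        reward = 1.0 if rng.random() < means[i] else 0.0
        pulls[i] += 1
        sums[i] += reward

    # Compare each arm's predicted label with its true label.
    errors = 0
    for i in range(k):
        estimate = sums[i] / pulls[i] if pulls[i] else 0.0
        predicted_above = estimate >= theta
        truly_above = means[i] >= theta
        errors += int(predicted_above != truly_above)
    return errors


def aggregate_regret(means, theta, budget, runs=200):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    total = sum(
        misclassified_count(means, theta, budget, seed=s) for s in range(runs)
    )
    return total / runs
```

For example, aggregate_regret([0.2, 0.45, 0.55, 0.8], theta=0.5, budget=400) returns a value between 0 and 4; arms with means close to the threshold contribute most of the error, which is why an adaptive allocation of the budget can improve on uniform sampling.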
arXiv:1905.11046
Country:
- North America > United States > Illinois (0.04)
- North America > United States > New York (0.04)
- North America > United States > Indiana > Monroe County > Bloomington (0.04)
Technology: